<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<html>
<head>
<!-- Copyright 1997 The Open Group, All Rights Reserved -->
<title>sort</title>
</head><body bgcolor=white>
<center>
<font size=2>
The Single UNIX &reg; Specification, Version 2<br>
Copyright &copy; 1997 The Open Group

</font></center><hr size=2 noshade>
<h4><a name = "tag_001_014_2030">&nbsp;</a>NAME</h4><blockquote>
sort - sort, merge or sequence check text files
</blockquote><h4><a name = "tag_001_014_2031">&nbsp;</a>SYNOPSIS</h4><blockquote>
<pre><code>

sort <b>[</b>-m<b>][</b>-o <i>output</i><b>][</b>-bdfinru<b>][</b>-t <i>char</i><b>][</b>-k <i>keydef</i><b>]</b>...<b>[</b>-z <i>recsz</i><b>]
[</b><i>file</i>...<b>]</b>

sort -c <b>[</b>-bdfinru<b>][</b>-t <i>char</i><b>][</b>-k <i>keydef</i><b>]</b>...<b>[</b>-z <i>recsz</i><b>][</b><i>file</i>...<b>]</b>

sort <b>[</b>-mu<b>][</b>-o <i>output</i><b>][</b>-bdfir<b>][</b>-t <i>char</i><b>][</b>+<i>pos1</i><b>[</b>-<i>pos2</i><b>]]</b>...<b>[</b>-z <i>recsz</i><b>]
[</b><i>file</i>...<b>]</b>

sort -c <b>[</b>-u<b>][</b>-bdfinr<b>][</b>-t <i>char</i><b>][</b>+<i>pos1</i><b>[</b>-<i>pos2</i><b>]]</b>...<b>[</b>-z <i>recsz</i><b>][</b><i>file</i><b>]</b>
</code>
</pre>
</blockquote><h4><a name = "tag_001_014_2032">&nbsp;</a>DESCRIPTION</h4><blockquote>
The
<i>sort</i>
utility performs one of the following functions:
<ol>
<p>
<li>
Sorts lines of all the named files together
and writes the result to the specified output.
<p>
<li>
Merges lines of all the named (presorted) files together
and writes the result to the specified output.
<p>
<li>
Checks that a single input file is correctly presorted.
<p>
</ol>
<p>
Comparisons are based on one or more
sort keys extracted from each line of input
(or the entire line if no sort keys are specified),
and are performed using the collating sequence of the current locale.
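<p>
The three functions can be sketched as follows.
This is an illustration only: the scratch directory name and the sample data are
assumptions made for the example, and
<i>LC_ALL</i>
is set to C solely so that the collating sequence is predictable.
<pre>
```shell
# Assumption: the C locale, so that ordering is by byte value.
LC_ALL=C; export LC_ALL

work=${TMPDIR:-/tmp}/sort_demo.$$     # hypothetical scratch directory
mkdir -p "$work"

# Function 1: sort unordered input; -o names the output file.
printf 'pear\napple\n' | sort -o "$work/a"
printf 'kiwi\nfig\n'   | sort -o "$work/b"

# Function 2: merge the two presorted files into one ordered stream.
merged=$(sort -m "$work/a" "$work/b")
printf '%s\n' "$merged"

# Function 3: check a single file; the result is reported by the exit status.
if sort -c "$work/a"; then check=sorted; else check=unsorted; fi
printf '%s\n' "$check"

rm -rf "$work"
```
</pre>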
</blockquote><h4><a name = "tag_001_014_2033">&nbsp;</a>OPTIONS</h4><blockquote>
The
<i>sort</i>
utility supports the <b>XBD</b> specification, <a href="../xbd/utilconv.html#usg"><b>Utility Syntax Guidelines</b></a>,
with three exceptions:
in the obsolescent versions, the notation
<i>+pos1</i>
<i>-pos2</i>
uses a non-standard prefix and multi-digit option names;
the
<b>-o</b>&nbsp;<i>output</i>
option is recognised after a
<i>file</i>
operand as an obsolescent
feature in both versions where the
<b>-c</b>
option is not specified;
and the
<b>-k</b>&nbsp;<i>keydef</i>
option should follow the
<b>-b</b>,
<b>-d</b>,
<b>-f</b>,
<b>-i</b>,
<b>-n</b>
and
<b>-r</b>
options.
<p>
The following options are supported:
<dl compact>

<dt><b>-c</b>
<dd>Check that the single input file is ordered as specified by
the arguments and the
collating sequence of the current locale.
No output is produced; only the exit code is affected.

<dt><b>-m</b>
<dd>Merge only;
the input files are assumed to be already sorted.

<dt><b>-o&nbsp;</b><i>output</i>
<dd>
Specify the name of an output file
to be used instead of the standard output.
This file can be the same as one of the input
<i>file</i>s.

<dt><b>-u</b>
<dd>Unique: suppress all but one in each
set of lines having equal keys.
If used with the
<b>-c</b>
option, check that there are no lines with duplicate keys,
in addition to checking that the input file is sorted.

<dt><b>-z&nbsp;</b><i>recsz</i>
<dd>
The size of the longest line read
in the sort phase is recorded so that buffers of the correct size
can be allocated during the merge phase.
If the sort phase is omitted via the
<b>-c</b>
or
<b>-m</b>
options, a system-dependent default size will be used.
Lines longer than the buffer size will cause
<i>sort</i>
to terminate abnormally.
Supplying the actual number of bytes in the longest line
to be merged (or some larger value)
will prevent abnormal termination.

</dl>
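<p>
As a hedged sketch of how
<b>-u</b>
and
<b>-o</b>
combine, the following removes duplicate lines from a file in place,
relying on the rule that the output file can be the same as an input file.
The file name and its contents are assumptions made for the example.
<pre>
```shell
# Assumption: the C locale, so that ordering is by byte value.
LC_ALL=C; export LC_ALL

f=${TMPDIR:-/tmp}/sort_inplace.$$     # hypothetical scratch file
printf 'b\na\nb\n' | sort -o "$f"     # the file now holds: a, b, b

# -u keeps one line from each set of equal lines;
# -o names the input file itself as the output file.
sort -u -o "$f" "$f"
inplace=$(cat "$f")
printf '%s\n' "$inplace"

rm -f "$f"
```
</pre>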
<p>
The following options override the default ordering rules.
When ordering options appear independent of
any key field specifications, the requested field ordering rules are
applied globally to all sort keys.
When attached to a specific key (see
<b>-k</b>),
the specified ordering options override all global ordering options
for that key.
In the obsolescent forms, if one or more of these options follows a
<i>+pos1</i>
option, it will affect only the key field specified by that preceding option.
<dl compact>

<dt><b>-d</b>
<dd>Specify that only blank characters
and alphanumeric characters, according to the current setting of
LC_CTYPE, are significant in comparisons.
The behaviour is undefined for a sort key to which
<b>-i</b>
or
<b>-n</b>
also applies.

<dt><b>-f</b>
<dd>Consider all lower-case characters that have upper-case
equivalents, according to the current setting of LC_CTYPE,
to be the upper-case equivalent for the purposes of comparison.

<dt><b>-i</b>
<dd>Ignore all characters that are non-printable,
according to the current setting of LC_CTYPE.

<dt><b>-n</b>
<dd>Restrict the sort key to
an initial numeric string,
consisting of optional
blank characters,
optional minus sign,
and zero or more digits with an optional
radix character and thousands separators
(as defined in the current locale),
which will be sorted by arithmetic value.
An empty digit string is treated as zero.
Leading zeros and signs on zeros do not affect ordering.

<dt><b>-r</b>
<dd>Reverse the sense of comparisons.

</dl>
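<p>
The effect of the ordering options on a whole-line key can be sketched as below.
The sample data is an assumption made for the example, and the C locale is
selected so that the default ordering is by byte value.
<pre>
```shell
# Assumption: the C locale, so that the default ordering is by byte value.
LC_ALL=C; export LC_ALL

# Default ordering compares characters, so "10" sorts before "9".
plain=$(printf '10\n9\n-2\n' | sort)

# -n compares the initial numeric string by arithmetic value.
numeric=$(printf '10\n9\n-2\n' | sort -n)

# -r reverses the sense of the comparison.
reversed=$(printf '10\n9\n-2\n' | sort -nr)

# -f treats lower-case letters as their upper-case equivalents;
# lines that then compare equal fall back to a comparison of all bytes.
folded=$(printf 'b\nA\na\nB\n' | sort -f)

printf '%s\n' "$plain" "$numeric" "$reversed" "$folded"
```
</pre>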
<p>
The treatment of field separators can be altered using the options:
<dl compact>

<dt><b>-b</b>
<dd>Ignore leading
blank characters
when determining the starting and ending
positions of a restricted sort key.
If the
<b>-b</b>
option is specified before the first
<b>-k</b>
option, it is applied to all
<b>-k</b>
options.
Otherwise, the
<b>-b</b>
option can be attached independently to each
<b>-k</b>&nbsp;<i>field_start</i>
or
<i>field_end</i>
option-argument (see below).

<dt><b>-t&nbsp;</b><i>char</i>
<dd>
Use
<i>char</i>
as the field separator character;
<i>char</i>
is not considered to be part of a field
(although it can be included in a sort key).
Each occurrence of
<i>char</i>
is significant (for example,
<i>&lt;char&gt;&lt;char&gt;</i>
delimits an empty field).
If
<b>-t</b>
is not specified,
blank
characters are used as default field separators;
each maximal non-empty sequence of blank characters that follows a
non-blank character is a field separator.

</dl>
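<p>
A short sketch of both options follows; the sample lines are assumptions made
for the example.
The first command shows that adjacent separators delimit an empty field under
<b>-t</b>,
and the other two show how
<b>-b</b>
changes a key that would otherwise start with the blanks preceding the field.
<pre>
```shell
# Assumption: the C locale, so that ordering is by byte value.
LC_ALL=C; export LC_ALL

# With -t, each separator is significant: "x::2" has an empty second
# field, so the third field is "2".
by_third=$(printf 'x::2\ny::1\n' | sort -t : -k 3,3)

# Without -t, field 2 includes its leading blanks: "  z" sorts before " y".
no_b=$(printf 'a  z\nb y\n' | sort -k 2,2)

# -b given before -k ignores those leading blanks: "y" sorts before "z".
with_b=$(printf 'a  z\nb y\n' | sort -b -k 2,2)

printf '%s\n' "$by_third" "$no_b" "$with_b"
```
</pre>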
<p>
Sort keys can be specified using the options:
<dl compact>

<dt><b>-k&nbsp;</b><i>keydef</i>
<dd>
The
<i>keydef</i>
argument is a restricted sort key field definition.
The format of this definition is:
<pre>
<code>
<i>field_start</i><b>[</b><i>type</i><b>][</b>,<i>field_end</i><b>[</b><i>type</i><b>]]
</b></code>
</pre>
where
<i>field_start</i>
and
<i>field_end</i>
define a key field restricted to a portion of the line (see
the EXTENDED DESCRIPTION section),
and
<i>type</i>
is a modifier from the list of characters
b,
d,
f,
i,
n,
r.
The
b
modifier behaves like the
<b>-b</b>
option, but applies only to the
<i>field_start</i>
or
<i>field_end</i>
to which it is attached.
The other modifiers
behave like the corresponding options, but apply
only to the key field to which they are attached; they
have this effect if specified with
<i>field_start,</i>
<i>field_end</i>
or both.
If any modifier is attached to a
<i>field_start</i>
or to a
<i>field_end</i>,
no option applies to either.
Implementations support at least nine occurrences of the
<b>-k</b>
option, which are significant in command line order.
If no
<b>-k</b>
option is specified, a
default sort key of the entire line is used.

When there are multiple key fields, later keys
are compared only after all earlier keys compare equal.
Except when the
<b>-u</b>
option is specified,
lines that otherwise compare equal are ordered
as if none of the options
<b>-d</b>,
<b>-f</b>,
<b>-i</b>,
<b>-n</b>
or
<b>-k</b>
were present (but with
<b>-r</b>
still in effect, if it was specified) and
with all bytes in the lines significant to the comparison.
The order in which lines that still compare equal
are written is unspecified.

<dt><i>+pos1</i><dd>Specify the start position of a key field.
See the EXTENDED DESCRIPTION section.

<dt><i>-pos2</i><dd>Specify the end position of a key field.
See the EXTENDED DESCRIPTION section.

</dl>
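<p>
As an illustration of multiple keys with attached modifiers, the sketch below
orders lines by the first field and, where the first fields compare equal, by
the second field in descending numeric order.
The sample data is an assumption made for the example.
<pre>
```shell
# Assumption: the C locale, so that ordering is by byte value.
LC_ALL=C; export LC_ALL

# First key: field 1 only.
# Second key: field 2 with the n (numeric) and r (reverse) modifiers;
# it is consulted only when the first key compares equal.
ranked=$(printf 'bob 5\namy 7\nbob 12\namy 3\n' | sort -k 1,1 -k 2,2nr)
printf '%s\n' "$ranked"
```
</pre>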
</blockquote><h4><a name = "tag_001_014_2034">&nbsp;</a>OPERANDS</h4><blockquote>
The following operand is supported:
<dl compact>

<dt><i>file</i><dd>A pathname of a file to be sorted, merged or checked.
If no
<i>file</i>
operands are specified,
or if a
<i>file</i>
operand is "-", the standard input will be used.

</dl>
</blockquote><h4><a name = "tag_001_014_2035">&nbsp;</a>STDIN</h4><blockquote>
The standard input will be used only if no
<i>file</i>
operands are specified,
or if a
<i>file</i>
operand is "-".
See the INPUT FILES section.
</blockquote><h4><a name = "tag_001_014_2036">&nbsp;</a>INPUT FILES</h4><blockquote>
The input files must be text files,
except that the
<i>sort</i>
utility will add a
newline character
to the end of a file ending
with an incomplete last line.
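<p>
A minimal sketch of this exception, using made-up input whose last line lacks a
newline character; counting the output lines shows that the missing newline
character was supplied.
An implementation may also write a warning to standard error.
<pre>
```shell
# The final line "a" has no terminating newline; sort supplies one,
# so wc -l counts two complete lines in the output.
lines=$(printf 'b\na' | sort | wc -l)
printf '%s\n' "$lines"
```
</pre>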
</blockquote><h4><a name = "tag_001_014_2037">&nbsp;</a>ENVIRONMENT VARIABLES</h4><blockquote>
The following environment variables affect the execution of
<i>sort</i>:
<dl compact>

<dt><i>LANG</i><dd>Provide a default value for the internationalisation variables
that are unset or null.
If
<i>LANG</i>
is unset or null, the corresponding value from the
implementation-dependent default locale will be used.
If any of the internationalisation variables contains an invalid setting, the
utility will behave as if none of the variables had been defined.

<dt><i>LC_ALL</i><dd>
If set to a non-empty string value,
override the values of all the other internationalisation variables.

<dt><i>LC_COLLATE</i><dd>
Determine the locale for ordering rules.

<dt><i>LC_CTYPE</i><dd>
Determine the
locale for the interpretation of sequences of bytes of text data as
characters (for example, single-
versus multi-byte characters in arguments and input files)
and the behaviour of character classification for the
<b>-b</b>,
<b>-d</b>,
<b>-f</b>,
<b>-i</b>
and
<b>-n</b>
options.

<dt><i>LC_MESSAGES</i><dd>
Determine the locale that should be used to affect
the format and contents of diagnostic
messages written to standard error.

<dt><i>LC_NUMERIC</i><dd>
Determine the locale for the definition of the
radix character and thousands separator for the
<b>-n</b>
option.

<dt><i>NLSPATH</i><dd>
Determine the location of message catalogues
for the processing of
<i>LC_MESSAGES</i>.
</dl>
</blockquote><h4><a name = "tag_001_014_2038">&nbsp;</a>ASYNCHRONOUS EVENTS</h4><blockquote>
Default.
</blockquote><h4><a name = "tag_001_014_2039">&nbsp;</a>STDOUT</h4><blockquote>
Unless the
<b>-o</b>
or
<b>-c</b>
options are in effect, the standard output contains the sorted input.
</blockquote><h4><a name = "tag_001_014_2040">&nbsp;</a>STDERR</h4><blockquote>
Used for diagnostic messages.
A warning message about correcting an incomplete last line
of an input file may be generated, but need not affect the
final exit status.
</blockquote><h4><a name = "tag_001_014_2041">&nbsp;</a>OUTPUT FILES</h4><blockquote>
If the
<b>-o</b>
option is in effect, the sorted input is placed in the file
<i>output</i>.
</blockquote><h4><a name = "tag_001_014_2042">&nbsp;</a>EXTENDED DESCRIPTION</h4><blockquote>
The notation:
<pre>
<code>
-k&nbsp;<i>field_start</i><b>[</b><i>type</i><b>][</b>,<i>field_end</i><b>[</b><i>type</i><b>]]
</b></code>
</pre>
defines a key field that begins at
<i>field_start</i>
and ends at
<i>field_end</i>
inclusive, unless
<i>field_start</i>
falls beyond the end of the line or after
<i>field_end</i>,
in which case the key field is empty.
A missing
<i>field_end</i>
means the last character of the line.
<p>
A field comprises a maximal sequence
of non-separating characters and,
in the absence of option
<b>-t</b>,
any preceding field separator.
<p>
The
<i>field_start</i>
portion of the
<i>keydef</i>
option-argument has the form:
<pre>
<code>
<i>field_number</i><b>[</b>.<i>first_character</i><b>]
</b></code>
</pre>
<p>
Fields and characters within fields are numbered starting with 1.
The
<i>field_number</i>
and
<i>first_character</i>
pieces, interpreted as positive
decimal integers, specify the first character to be used as part of
a sort key.
If
<i>.first_character</i>
is omitted, it refers to the first character of the field.
<p>
The
<i>field_end</i>
portion of the
<i>keydef</i>
option-argument has the form:
<pre>
<code>
<i>field_number</i><b>[</b>.<i>last_character</i><b>]
</b></code>
</pre>
<p>
The
<i>field_number</i>
is as described above for
<i>field_start.</i>
The
<i>last_character</i>
piece, interpreted as a non-negative decimal integer,
specifies the last character to be used as part of the sort key.
If
<i>last_character</i>
evaluates to zero or
<i>.last_character</i>
is omitted, it
refers to the last character of the field specified by
<i>field_number.</i>
<p>
If the
<b>-b</b>
option or
b
type modifier is in effect, characters within a
field are counted from the first
non-blank character
in the field.
(This applies separately to
<i>first_character</i>
and
<i>last_character</i>.)
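<p>
Character positions within a field, with and without the
b
modifier, can be sketched as follows.
The sample lines are assumptions made for the example.
<pre>
```shell
# Assumption: the C locale, so that ordering is by byte value.
LC_ALL=C; export LC_ALL

# The key is characters 2 through 3 of field 1: "20", "10" and "15".
by_chars=$(printf 'a20\nb10\nc15\n' | sort -k 1.2,1.3)

# With the b modifier on both ends, characters are counted from the first
# non-blank character of field 2, so the keys are "b" and "a".
by_nonblank=$(printf 'x  ab\ny  ba\n' | sort -k 2.2b,2.2b)

printf '%s\n' "$by_chars" "$by_nonblank"
```
</pre>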
<p>
The obsolescent options:
<pre>
<code>
<b>[</b>+<i>pos1</i><b>[</b>-<i>pos2</i><b>]]
</b></code>
</pre>
provide functionality equivalent to the
<b>-k</b>&nbsp;<i>keydef</i>
option.
For comparison, the full formats of these options are:
<pre>
<code>
+<i>field0_number</i><b>[</b>.<i>first0_character</i><b>][</b><i>type</i><b>]
    [</b>-<i>field0_number</i><b>[</b>.<i>first0_character</i><b>][</b><i>type</i><b>]]</b>
-k <i>field_number</i><b>[</b>.<i>first_character</i><b>][</b><i>type</i><b>]
    [</b>,<i>field_number</i><b>[</b>.<i>last_character</i><b>][</b><i>type</i><b>]]
</b></code>
</pre>
<p>
In the obsolescent form, fields (specified by
<i>field0_number</i>)
and characters within fields (specified by
<i>first0_character</i>)
are numbered from zero instead of one.
The optional type modifiers are the same in both forms.
If
<i>.first0_character</i>
is omitted or
<i>first0_character</i>
evaluates to zero, it refers to the first character of the field.
The
<b>-b</b>
option does not apply to
<i>-pos2</i>.
<p>
The fully specified
<i>+pos1</i>
<i>-pos2</i>
form with type modifiers
<i>T</i>
and
<i>U</i>:
<pre>
<code>
+<i>w</i>.<i>xT</i> -<i>y</i>.<i>zU</i>
</code>
</pre>
<br>
is equivalent to:
<pre>
<dl compact><dt> <dd>
<table><tr valign=top><th align=left><i>undefined</i>
<th align=left>(<i>z</i>==0 &amp; <i>U</i> contains b &amp; -t is present)
<tr valign=top><td align=left>-k <i>w</i>+1.<i>x</i>+1<i>T</i>,<i>y</i>.0<i>U</i>
<td align=left>(<i>z</i>==0 otherwise)
<tr valign=top><td align=left>-k <i>w</i>+1.<i>x</i>+1<i>T</i>,<i>y</i>+1.<i>zU</i>
<td align=left>(<i>z</i> &gt; 0)
</table>
</dl>
</pre>
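<p>
The equivalence can be transcribed into a small shell function.
The name
pos_to_keydef
is hypothetical, and the sketch omits the type modifiers
<i>T</i>
and
<i>U</i>,
so the undefined case cannot arise in it.
<pre>
```shell
# pos_to_keydef w x y z
# Print the -k form equivalent to "+w.x -y.z" (hypothetical helper;
# type modifiers are not handled).
pos_to_keydef() {
    w=$1 x=$2 y=$3 z=$4
    if [ "$z" -eq 0 ]; then
        # z == 0: the key ends at the last character of field y.
        printf '%s %d.%d,%d.0\n' -k "$((w + 1))" "$((x + 1))" "$y"
    else
        # z is positive: the key ends at character z of field y+1.
        printf '%s %d.%d,%d.%d\n' -k "$((w + 1))" "$((x + 1))" "$((y + 1))" "$z"
    fi
}

k1=$(pos_to_keydef 1 0 2 0)   # corresponds to: +1 -2
k2=$(pos_to_keydef 1 1 1 2)   # corresponds to: +1.1 -1.2
printf '%s\n' "$k1" "$k2"
```
</pre>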
<p>
As with the non-obsolescent forms, implementations support at least
nine occurrences of the
<i>+pos1</i>
option, which are significant in command line order.
</blockquote><h4><a name = "tag_001_014_2043">&nbsp;</a>EXIT STATUS</h4><blockquote>
The following exit values are returned:
<dl compact>

<dt>0<dd>All input files were output successfully,
or
<b>-c</b>
was specified and the input file was correctly sorted.

<dt>1<dd>Under the
<b>-c</b>
option, the file was not ordered
as specified, or
if the
<b>-c</b>
and
<b>-u</b>
options were both specified,
two input lines were found with equal keys.

<dt>&gt;1<dd>An error occurred.

</dl>
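<p>
The exit values for
<b>-c</b>
can be observed with a sketch such as the following; the input lines are
assumptions made for the example.
Some implementations also write a diagnostic to standard error when the check
fails.
<pre>
```shell
# Assumption: the C locale, so that ordering is by byte value.
LC_ALL=C; export LC_ALL

# Correctly sorted input: exit status 0.
ordered=0
printf 'a\nb\n' | sort -c || ordered=$?

# Input that is not ordered as specified: exit status 1.
disorder=0
printf 'b\na\n' | sort -c || disorder=$?

# Ordered input with two equal keys, checked with -c and -u: exit status 1.
duplicate=0
printf 'a\na\n' | sort -c -u || duplicate=$?

printf '%s %s %s\n' "$ordered" "$disorder" "$duplicate"
```
</pre>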
</blockquote><h4><a name = "tag_001_014_2044">&nbsp;</a>CONSEQUENCES OF ERRORS</h4><blockquote>
Default.
</blockquote><h4><a name = "tag_001_014_2045">&nbsp;</a>APPLICATION USAGE</h4><blockquote>
The default value for
<b>-t</b>,
blank character,
has different properties from, for example,
-t "&lt;space&gt;".
If a line contains:
<pre>
<code>
&lt;space&gt;&lt;space&gt;foo
</code>
</pre>
the following treatment would occur
with default separation as opposed to specifically selecting a
space character:
<br>
<p><table  bordercolor=#000000 border=1 align=center><tr valign=top><th align=center><b>Field</b>
<th align=center><b>Default</b>
<th align=center><b>-t "&lt;space&gt;"</b>
<tr valign=top><td align=left>1
<td align=left>&lt;space&gt;&lt;space&gt;foo
<td align=left><i>empty</i>
<tr valign=top><td align=left>2
<td align=left><i>empty</i>
<td align=left><i>empty</i>
<tr valign=top><td align=left>3
<td align=left><i>empty</i>
<td align=left>foo
</table>
<p>
The leading field separator itself is included in a field when
<b>-t</b>
is not used.
For example, this command returns an exit status
of zero, meaning the input was already sorted:
<pre>
<code>
sort -c -k 2 &lt;&lt;eof
y&lt;tab&gt;b
x&lt;space&gt;a
eof
</code>
</pre>
(assuming that a
tab character
precedes the
space character
in the current collating sequence).
The field separator is not included in a field when it
is explicitly set via
<b>-t</b>.
This is historical practice and allows usage such as:
<pre>
<code>
sort -t "|" -k 2n &lt;&lt;eof
Atlanta|425022|Georgia
Birmingham|284413|Alabama
Columbia|100385|South Carolina
eof
</code>
</pre>
where the second field can be correctly sorted numerically
without regard to the non-numeric field separator.
<p>
The wording in the OPTIONS section clarifies that the
<b>-b</b>,
<b>-d</b>,
<b>-f</b>,
<b>-i</b>,
<b>-n</b>
and
<b>-r</b>
options have to come before the first sort key specified if they are
intended to apply to all specified keys.
The way it is described in this
document matches historical practice, not historical documentation.
In the non-obsolescent versions, the results are unspecified if these options
are specified after a
<b>-k</b>
option.
<p>
The
<b>-f</b>
option might not work as expected in locales where there is not a
one-to-one mapping between an upper- and a lower-case letter.
</blockquote><h4><a name = "tag_001_014_2046">&nbsp;</a>EXAMPLES</h4><blockquote>
In the following examples, non-obsolescent and obsolescent ways of specifying
sort keys are given as an aid to understanding the relationship
between the two forms.
<ol>
<p>
<li>
Either of the following commands sorts the contents of
<b>infile</b>
with the second field as the sort key:
<pre>
<code>
  sort -k 2,2 infile
  sort +1 -2 infile
</code>
</pre>
<p>
<li>
Either of the following commands sorts, in reverse order, the contents of
<b>infile1</b>
and
<b>infile2</b>,
placing the output in
<b>outfile</b>
and using the second
character of the second field as the sort key
(assuming that the first character of the
second field is the field separator):
<pre>
<code>
  sort -r -o outfile -k 2.2,2.2 infile1 infile2
  sort -r -o outfile +1.1 -1.2 infile1 infile2
</code>
</pre>
<p>
<li>
Either of the following commands sorts the contents of
<b>infile1</b>
and
<b>infile2</b>
using the second
non-blank
character of the second
field as the sort key:
<pre>
<code>
  sort -k 2.2b,2.2b infile1 infile2
  sort +1.1b -1.2b infile1 infile2
</code>
</pre>
<p>
<li>
Either of the following commands prints the System&nbsp;V password file (user
database) sorted by the numeric user ID (the third colon-separated
field):
<pre>
<code>
  sort -t : -k 3,3n /etc/passwd
  sort -t : +2 -3n /etc/passwd
</code>
</pre>
<p>
<li>
Either of the following commands prints the lines of the already sorted
file
<b>infile</b>,
suppressing all but one occurrence of lines having the
same third field:
<pre>
<code>
  sort -um -k 3.1,3.0 infile
  sort -um +2.0 -3.0 infile
</code>
</pre>
<p>
</ol>
</blockquote><h4><a name = "tag_001_014_2047">&nbsp;</a>FUTURE DIRECTIONS</h4><blockquote>
None.
</blockquote><h4><a name = "tag_001_014_2048">&nbsp;</a>SEE ALSO</h4><blockquote>
<i><a href="comm.html">comm</a></i>,
<i><a href="join.html">join</a></i>,
<i><a href="uniq.html">uniq</a></i>,
the <b>XSH</b> specification description of
<i><a href="../xsh/toupper.html">toupper()</a></i>.
</blockquote><hr size=2 noshade>
<center><font size=2>
UNIX &reg; is a registered Trademark of The Open Group.<br>

</font></center><hr size=2 noshade>
</body></html>
